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(57) An efficient computation of low-dimensional lin- 
ear subspaces that optimally contain the set of images 
that are generated by varying the illumination impinging 
on the surface of a three-dimensional object for many 
different relative positions of that object and the viewing 
camera. The matrix elements of the spatial covariance 
matrix for an object are calculated for an arbitrary pre- 
determined distribution of illumination conditions. The 
maximum complexity is reduced for the model by ap- 



proximating any pair of normal-vector and albedo from 
the set of all such pairs of albedo and normals with the 
centers of the clusters that are the result of the vector 
quantization of this set. For an object, a viewpoint-inde- 
pendent covariance matrix whose complexity is large, 
but practical, is constructed and diagonalized off-line. A 
viewpoint-dependent covariance matrix is computed 
from the viewpoint-independent diagonalization results 
and is diagonalized online in real time. 
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Description 

BACKGROUND OF THE INVENTION 
5 Field of the Invention 

[0001] This invention relates to computer vision and more specifically to the efficient computation of the low-dimen- 
sional linear subspaces that optimally contain the set of images that are generated by varying the illumination impinging 
on the surface of a three-dimensional object, such as a human head, for many different relative positions of that object 
10 and the viewing camera. 



[0002] In a typical system for object recognition under varying viewpoint and illumination conditions, information 
15 about properties, such as the three-dimensional shape and the reflectance, of a plurality of objects 1 00, such as human 
faces and/or heads, is stored in a database 101. When such a system is in use, typically a plurality of queries 102, in 
the form of images of objects taken from non-fixed viewpoint and illumination conditions, is matched against said 
database 101. For each query image 103, matching 104 is performed against all objects 100 in said database 101. 
[0003] In order to match an individual object 105 against an individual query 103, several steps are typical. First, the 
20 viewpoint of the query is estimated 106. Second, a viewpoint-dependent illumination subspace 107 is generated, as 
outlined below, from: the three-dimensional data about each said object 1 00, and the determined viewpoint 1 06. Further, 
either the illumination condition is estimated, or a similarity score is generated, or both 108. 

[0004] The scores from the matching of said query 103 to said plurality of objects 100 are finally compared against 
each other to determine the best match 109 resulting in object 105 being recognized. 

25 [0005] The set of images of a given object under all possible illuminations, but fixed viewpoint, will be called the 
illumination subspace of the object for that viewpoint. It has been both observed in practice, and argued in theory, that 
said illumination subspace is mostly enclosed in a low-dimensional subspace, with dimensionality Mas low as M G. 
See P. Hallinan, "A Low-Dimensional Representation of Human Faces for Arbitrary Lighting Conditions", IEEE Con- 
ference on Computer Vision and Pattern Recognition, pp. 995-999 (1 994); and R. Basri et al., "Lambertian Reflectance 

30 and Linear Subspaces", to appear in the international Conference on Computer Vision, (July 2001). 

[0006] There are several object, and more specifically face, recognition algorithms that are based implicitly or ex- 
plicitly on this fact. See Georghiades et al., "From Few to Many: Generative Models for Recognition Under Variable 
Pose and Illumination", Proceedings of the 4 th International Conference on Automatic Face & Gesture Recognition, 
pp. 264-270, (March 2000); R. Ishiyama et al., "A New Face-Recognition System with Robustness Against Illumination 

35 Changes", IAPR Workshop on Machine Vision Applications, pp. 127-131 (November 2000); and Basri, et al. supra. 

Estimation of a Gaussian Probability Density by the Karhunen-Loeve Transform 

[0007] The method for finding a hierarchy of low-dimensional subspaces that optimally contain a given set of images, 
40 or more generally, snapshots, is the Karhunen-Loeve Transform (KLT), which is known under a variety of names - 
Principal Component Analysis, Empirical Eigenfunctions, and Singular Value Decomposition (SVD) - and is closely 
related to Linear Regression and Linear Factor Analysis, among others. SeeM. Tipping et al., "Mixures of Probabilistic 
Principal Component Analysers", Neural Computation, Vol. 11, No. 2, pp. 443-482 (February 1999). 
[0008] The basic facts about the KLT are summarized in this section. Of importance for the disclosure further are 
45 the implementation choices, the optimality properties, and the requirements for computational time and storage space. 
Also, frequent references to the equations herein will be made below. 

[0009] A snapshot, such as an image, will be represented by the intensity values c|) (x), where {x} is a pixel grid that 
contains ^/pixels. An ensemble of Tsnapshots will be denoted by {^'(x)}^^ Briefly, (see I. Joliffe, "Principal Component 
Analysis" (1986); and P. Penev, "Local Feature Analysis: A Statistical Theory for Information Representation and Trans- 
it? mission", PhD Thesis, The Rockefeller University, (1998)) its KLT representation is 



Prior Art 



55 




(1) 



where M= min(T, V) is the rank of the ensemble, {o^}is the (non-increasing) eigenspectrum of the "spatial 1 
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R <* y> =^2> ' w ' oo - £vr (*» rV,(y) (2) 



and the "temporal" 



C"'=^'(^(x)^a' r cy r (3) 

covariance matrices, and {y^x)}and {a'} are their respective orthonormal eigenvectors. When A^T^V, the diagonali- 
zation of C (eqn. 3) is the easier. 

15 [0010] Notably, the storage of C requires OCT 2 ) storage elements, and of R, 0( V 2 ) - the dependence of the storage 
requirements on the size of the problem is quadratic. Analogously, the time to compute the eigenvalues and eigenvec- 
tors (eqn. 2) of C is C^T 3 ), and of R, 0( V s ) - the dependence of the computational time on the size of the problem is 
cubic. In practical terms, this means that solving a system that is ten times as large requires a hundred 'times the space 
and a thousand times the computational power. 

20 [001 1 ] The average signal power of the ensemble is 



YWixT =trR = trR./ " 

TV 

25 * J r=1 



N 

KLT is optimal in the sense that, among all A/-dimensional subspaces (N < M), the subset of eigenmodes {¥,.} (eqn. 
2) span thesubspace which captures the most signal power, trR N . SeeM. Loeve, "Probability Theory", (195^); and I. 
Joliffe, supra. For a given dimensionality N, the reconstruction of the snapshot $ (x) is 

30 
35 

[0012] With the standard multidimensional Gaussian model for the probability density P[cf>] the information content 
of the reconstruction (eqn. 5) is 



-iog/k]~EKf (6) 

[0013] Notably, this model is spherical - the KLT coefficients (eqn. 1) are of unit variance (eqn, 3), (a') = 1 , and each 
45 of the N dimensions contributes equally to the information that is derived from the measurement, afthough only the 
leading dimensions contribute significantly to the signal power. 

[001 4] This is a manifestation of the fact that, in principle, even weak signals can be important if they are sufficiently 
rare and unexpected. In practice, however, signals are embedded in noise, which typically has constant power; and 
weak signals, even though important, may not be reliably detected. 

50 [0015] The situation is complicated further by the fact that in practice every estimation is done from a finite sample 
(T < o*,V < oo). Nevertheless, the shape of the eigenspectrum of sample-noise covariance matrices is known - it is 
determined by the ratio V/T(See J. Silverstein, "Eigenvalues and Eigenvectors of Large-Dimensional Sample Covar- 
iance Matrices", Contemporary Mathematics, Vol. 50, pp. 153-159 (1986); and A. Sengupta et al., "Distributions of 
Singular Values for Some Random Matrices", Physical Review E, Vol. 60, No. 3, pp. 3389-3392, (September 1999), 

55 and this knowledge can be used to recover the true spectrum of the signal through Bayesian estimation. SeeR. Everson 
et al., "Inferring the Eigenvalues of Covariance Matrices from Limited, Noisy Data", IEEE Transactions on Signal 
Processing, Vol 48, No. 7, pp. 2083-2091, (2000). Although this can serve as a basis for principled choice for the 
dimensionality, N, also called model selection, in the context of face recognition, this choice is typically guided by 
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heuristic arguments. See P. Penev et al., "The Global Dimensionality of Face Space", Proceedings of the 4 th Interna- 
tional Conference on Automatic Face & Gesture Recognition, pp. 264-270 (March 2000). 

Viewpoint-Dependent Subspaces for Face Recognition 

5 

[0016] At the heart of most high-level algorithms for computer vision is the question: How does an object look from 
a specific viewpoint and a specific illumination condition? 

[0017] Here we describe the standard method for finding the illumination subspace 107 - a small number of basis 
images 110 {\|//x)}(eqn. 2) which can be linearly admixed (eqn. 5) to approximate substantially all images of a given 

10 object under a fixed viewpoint, but any illumination condition. 

[0018] An illumination condition at a given point, x, on the surface of an object is defined by specifying the intensity 
of the light, L(x,n) that is coming to the point from the direction n £= S 2 , where n is a normal vector on the unit sphere 
S 2 centered at that point. Typically, the assumption is made that the light sources are sufficiently far away from the 
object - the distance to them is much larger than the size of the object - and therefore all points on the surface see the 

15 same illumination, L(n). 

[0019] In order to calculate the illumination subspace, it is customary to first generate a finite set of T illumination 
conditions {/.'(n)}^, and from them, a corresponding set of images {l t (x)}^ e7 -of the object the under a fixed viewpoint. 
When /'(x) is identified with c|)^from (eqn. 2), the Illumination subspace hierarchy is determined by (eqn. 3), and an 
application-dependent cutoff, N(ct eqn. 5), is chosen. 

20 [0020] There are two general ways to generate an image A (x) for a given illumination condition: to make an actual 
photograph of the physical object under the given illumination condition, or to use computer graphics techniques to 
render a 3D model of the object. See A. Georghiades, supra.; and R. Ishiyama et al., supra. 

[0021] Although the first method is more accurate for any given picture, it is very labor-intensive, and the number of 
pictures that can be done is necessarily small. Thus, the practical way to attempt the sampling of the space of illumi- 
25 nation conditions is to use synthetic images obtained by computer rendering of a 3D model of the object. 

[0022] When doing computer rendering, very popular is the assumption that the reflectance of the surface obeys the 
Lambertian reflectance model (See J. Lambert, "Photometria Sive de Mensura et Gradibus Luminus, Colorum et Um- 
brae," (1760)): the intensity of the reflected light /(x) at the image point x from a point light source, L which illuminates 
x without obstruction is assumed to be 

30 

l(x) = a(x)L*p(x) (7) 

where oc(x)is the intrinsic reflectance, or albedo, p(x) is the surface normal at the grid point x, and ••• denotes the dot 
35 product of the two vectors. 

[0023] Since the Lambertian model (eqn. 7) is linear, the sampling of the space of illumination conditions can be 
chosen to be simple - the P is rendered with a single point-light source illuminating the object from a pre-specified 
direction - n f : 

40 

L*(n) = fd(n - n) (8) 

where fi is the intensity of the light that comes from this direction. Moreover, the typical assumption is that light from 
any direction is equally probable, and is taken to be a constant, typically unity. 
45 [0024] Sufficient sampling of the illumination space is expensive. Currently, the space of images of a given object 
under varying illumination conditions but a fixed pose (camera geometry) is sampled by sampling the illumination 
conditions. These samples are used to estimate the full space. The current approach suffers from the following prob- 
lems: 

50 • sampling is expensive: either by varying the light physically and taking pictures, or by rendering a 3-dimensional 
model; 

• therefore, A) dense sampling is not performed, leading to an inaccurate model, or B) dense sampling is performed, 
which makes the system unacceptably slow. 

55 [0025] Therefore, currently the subspace approach cannot be practically applied to certain face recognition tasks. 
[0026] Dense sampling of the illumination space consumes impractically large time and memory. In order to make 
the model of the subspace more accurate, denser and denser sampling of the illumination conditions is necessary. 
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This entails quadratic growth of the necessary storage space and cubic growth in the time of finding the enclosing 
linear subspace. As the accuracy of the model increases, the complexity becomes limited by the number of pixels in 
the image; this complexity is impractically large. 

[0027] Direct computation of the covariance matrix is expensive. In the regime where the complexity of the model is 
5 limited by the number of pixels in the images, an amount of computation quadratic in the number of pixels is necessary 
to calculate the elements of the covariance matrix. This is only practical if the individual computation is fast. 
[0028] The computations for a given 3D object need to be done for arbitrary viewpoint. In typical situations, many 
3D objects need to be matched to a given query, for multiple queries. The viewpoints for which the respective enclosing 
linear subspaces need to be calculated are not known in advance, and change with every query/model pair. When the 
10 time for finding an illumination subspace for a given viewpoint is large, practical application of the method is impossible. 

SUMMARY OF THE INVENTION 

[0029] The disclosed method involves a series of steps. 

15 [0030] First, for any given object 105 in said database 101 , the matrix elements of the spatial covariance matrix 201 
are calculated for an arbitrary pre-determined distribution of illumination conditions. We disclose a method for such a 
calculation in which the matrix element for any given pair of pixels, for the most common assumption of illumination 
conditions, only depends on a scalar parameter - the planar angle between the surface normals at these pixels. More- 
over, we disclose a method for the efficient approximation and tabulation of this dependency 206. 

20 [0031] Second, for any given 3D object 105, we reduce the maximum complexity of the model by first constructing 
the set 205 of all pairs of an albedo and a normal vector - one pair from each point on the surface of the object - and 
then approximating every such pair with the corresponding centers of the clusters that are the result of vector quanti- 
zation 202 of this set. 

[0032] Third, for any 3D object 1 05, we construct and diagonalize off-line a viewpoint-independent covariance matrix 
25 201 whose complexity is large, but practical 203. Further, we store only the leading subspace of that matrix 204. 

[0033] Fourth, for any given viewpoint of a 3D object 105, the complexity of said viewpoint-dependent illumination 
subspace 107 is bounded from above by some small number, which is much larger than the actual complexity, but 
much smaller than the number of clusters from the previous step. A viewpoint-dependent covariance matrix 301 is 
computed from the viewpoint-independent diagonalization results 204 and is diagonalized online in real time 302. The 
30 leading subspace 303 of this covariance matrix is an approximation of the true solution 107. 

[0034] The present invention provides the following advantages that solves those problems with prior art methods. 

• The model is accurate because the illumination conditions are, in effect, sampled infinitely densely. 

• The complexity is reduced in an optimal way by vector-quantizing the set of actual normal vectors. 

35 • The calculation of the individual matrix element is fast, because it involves only the calculation of a cosine and a 
lookup of a table. 

• Sufficient precision is maintained throughout all viewpoint-independent calculations, which need to be done only 
once per 3D object, and therefore can be done off line. The real-time, viewpoint-dependent computations are 
performed in a low-dimensional space, which makes them practically feasible. 

40 

[0035] The invention allows the efficient computation of the low-dimensional linear subspaces that mostly contain 
the set of images that are generated by varying the illumination impinging on the surface of a 3-dimensional object, 
such as a human head, for many different viewpoints - relative positions of that object and the viewing camera. 
[0036] This is necessary in order to use in practice any object-recognition method that utilizes a low-dimensional 
45 viewpoint-dependent linear subspace. 

[0037] Apart from the primary use of the disclosed method as a necessary step in face recognition, it makes practically 
feasible the use of a low-dimensional viewpoint-dependent linear subspace for: the estimation of the lighting conditions 
of an image; the estimation of the pose of the 3D object in that image; the recognition of an object within a class of 
objects. 

50 

BRIEF DESCRIPTION OF THE DRAWINGS 
[0038] 

55 Figs. 1 A and 1 B are general flowcharts of the prior-art matching process; 

Fig. 2 is a flowchart of the viewpoint-independent steps or the present invention; and 
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Fig. 3 is a flowchart of the viewpoint-dependent steps of the present invention. 
DETAILED DESCRIPTION OF THE INVENTION 

Fast Calculation of an Element of the Spatial Covariance Matrix for a Given Continuous Distribution of 
Illumination Conditions 

[0039] The results of any high-level computer-vision algorithm, such as determination of the light sources and rec- 
ognition of a 3-dimensional object, critically depend on the degree of correctness of the low-dimensional subspace 
107 in which the images of the target object 105 are believed to be contained. 

[0040] In order to build increasingly correct approximations of said subspace 107, increasingly finer samplings of 
the illumination conditions are necessary. This is, however, a problem, since for every sample point a new rendering 
of the object needs to be made, and the rendering operation is very expensive. Also, since the direction of the illumi- 
nation is parameterized by the points on a sphere - a two-dimensional surface - the number of samples on that surface, 
T, grows quadratically, 

T=oii/df), 

with the sampling quality - the inverse of the distance between the samples, d-and hence the storage space is 

oiii/drl 

and the computational requirements, 

That is, to increase the accuracy of the sampling ten times, the storage requirement is increased ten thousand times, 
and the required computational power, one million times. 

[0041] In order to solve the problem with the density of sampling of the illumination conditions, we disclose the 
following method for calculating the elements of the spatial covariance matrix 201 (eqn. 2) under infinitely dense sam- 
pling of space of illumination conditions (cf. eqn. 8). 

[0042] According to (eqn. 2) the same subspace can be found by either diagonalizing the "temporal" covariance 
matrix, C, or the "spatial" one, R. We disclose below a method to calculate, according to (eqn. 2), the element R(x,y) 
for infinitely dense sampling of the illumination conditions. 

[0043] When the image / is identified with and the point-source basis is used (eqn. 8), the individual element of 
the sum in (eqn. 2) that corresponds to a light source L, with intensity L = \\L\\ and coming from direction n = UL, is 

l L (x)l L (y) = a(x)a(y)L*p(x)L'q(y) - a(x)a(y)D(L,p,q) (9) 

[0044] Where D(L,p,q) depends on the light, and only on the value of the normals p and q, but not on which particular 
points of the image they appear at. Therefore, the matrix element R(x,y) is viewpoint-independent. 
[0045] In order to calculate D(L,p,q), we express the vectors in a coordinate system in which the z-axis is perpen- 
dicular to the plane defined by p and q, and the x-axis is along the bisector of the angle between them, 2G. In these 
coordinates 
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p=(c,+s, 0) 

q=(c,-s,0) (10) 
L = (L x ,L yi L z ) 

where c=cos6, and s= sine ; then /_• p = cL x + sL y ,L* q= cL x - sL y , and finally 

D{L,p,q)= c 2 4-(1-c 2 )L^L^2(c 2 ^-(1-c 2 )^)-L 2 C(n,c) (11) 



where L is the light intensity and n is the unit vector in the light direction. Notably, the result does not depend on the 
vectors p and q individually, but only on the cosine of the half-angle between them, c; this forms the foundation of the 
15 method for fast calculation of the matrix element R(x,y). Also note this result (eqn. 11) does not depend on the z- 
component of the light. 

jJ0046] We now calculate the matrix element R(x,y) from (eqn. 2). In the limit of infinitely dense sampling, the sum 
-Z t is replaced by an integral over all possible light directions, weighted by their respective intensities 



R(x,y)=a(x)a(y)j nesl l}(n)D(n t p,q) (12) 



where u x and u y are the unit vectors of the special coordinate system defined by p and q (eqn. 1 0). Depending on the 
symmetry of the assumptions forthe illumination conditions, E(u x ,u y ,c) depends on either three, two, or one parameter. 
Notably, it does not depend on the 3D object itself, and can be pre-calculated once for the whole database of objects. 
[0047] The most common case, which is also the most symmetric, allows a further simplification. With the assumption 
30 that there is no preferred direction in the illumination - light could come from any direction equally likely (cf. eqn. 8) -L 
(n) is a constant, and the integral (eqn. 12) no longer depends on the special coordinate system (eqn. 10); it depends 
only on c. In that case, it can be calculated in advance with great precision and stored in a one-dimensional table; this 
makes its use later essentially free. The final formula for the matrix element is then 



R(x,y) = a{x)a(y)E(c(p(x),q(y))) (13) 



where the only operations involved in the calculation are table lookups, one cosine function, and two multiplications. 
[0048] Notably, (eqn. 12) is much more generally applicable than the point-light-source illumination conditions that 
40 were assumed in its derivation. Suppose there is an a priori distribution of illumination conditions - the probability 
density that / is the light intensity that comes from a particular direction n G S 2 on the unit sphere, S 2 , is F[/(n)]; this 
density integrates to a distribution, a, which induces the measure do. Then, in the limit of infinitely dense sampling, 
thesumlEfis replaced by and integral overall possible illumination conditions, weighted by their respective probabilities 

45 

R(x,y) = a(x)CL(y)j le(il . ) s> dG^mDin.p^) 

= aCxXxOJb D(n,p,q) dol(n) (14) 
= a{xp.{y)\ nesl D{n, P) q)L(n) 

which is identical to (eqn. 12); L(n) is interpreted now as the average light intensity that comes from direction n. 
55 [0049] In summary, for any assumed or estimated distribution of illumination conditions, not necessarily from point 
light sources, the matrix element R(x,y) (eqn. 2) does not depend on the object in question, and therefore can be pre- 
calculated once for database 101 and stored 206 for successive use (eqn. 12). Also, in the most complicated case, a 
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3-dimensional table is necessary, which is practical. Moreover, in the most common case, only a one-dimensional table 
is needed (eqn. 13), which requires essentially zero computational and storage resources in practice. 

Monte-Carlo Calculation of the Integrals Involved 

5 

[0050] In order for (eqn. 12) to be useful in practice, there has to be a practical way to evaluate the integral. In the 
simplest case, when the illumination conditions are isotropic (eqn. 13), the integral is relatively simple 

10 E < C ) = J c2 t m y r/"*-^-^ < 15 > 

1U nern{C(n,c)>0} A y 

and can be evaluated analytically. Here, the condition C(n,c) >0 (eqn. 11) ensures that both points are simultaneously 
visible from the given direction, (n). In general, since the assumed light-source distribution Z_(n) can be anything, even 
something that has been experimentally measured and does not have an analytic form to begin with, it is of interest 
15 to be able to calculate (eqn. 12) numerically. Because the result is object-independent, this can be done just once for 
the whole database 101 , and stored 206. 

[0051] A feasible approach for the evaluation of (eqn. 15) is the Monte Carlo (MC) procedure for evaluating integrals. 
In this case, the same procedure is iterated over many random realizations £ G X. First a random vector is generated, 
/> G [0,1 ] 3 , that is uniformly distributed in the unit cube. The subset that is also in the unit sphere, 

20 

f&l<4 

25 is hence uniformly distributed there. Then, the normalized random vectors 

30 

are distributed uniformly on the surface of the unit sphere, S 2 . This procedure for generating uniformly distributed 
normals is relatively efficient - a large fraction, 4n/(3*8) « 52% of the vectors can be used to produce useful normal 
vectors, n^. 

[0052] The next step is to discard those light directions for which one or both of the surface points are not visible. 
35 This is achieved by the test for the condition C(n,c) > 0. For the rest of the directions, from which both surface points 
are visible, the integrand is evaluated and accumulated. 

[0053] Another computational saving can be achieved, which is grounded in the fact that the joint-visibility condition, 
C(n,c), is a monotonically increasing function of c. This allows one to find, for any random light direction, n^, the value 
c^\C(n^, = 0, which defines the border of the joint-visibility, and then update all intermediate results for the integrals 
40 {E(c)\c> c^}. Thus, the integrals (eqn. 15) are calculated simultaneously, in the same MC procedure. 

[0054] Finally the results are tabulated and stored 206 for the subsequent calculation of R(x,y) for said plurality of 
objects 101. 

Optimal Dimensionality Reduction By Vector Quantization 

45 

[0055] An efficient method is disclosed above to calculate the element R(x,y) of the spatial covariance matrix 201 
(eqn. 2) when the space of illumination conditions is sampled infinitely densely. In order for that method to be useful 
in practice, the covariance matrix R needs to be diagonalized in reasonable time. In a typical picture, there are V = 
480x640 » 300,000 pixels. Even if the object occupies only a fraction of the picture, say 1 0%, this still leaves too large 
50 a matrix to be subsequently diagonalized. Here we disclose a method to reduce the dimensionality of the problem and 
at the same time optimally retain the illumination subspace. 

[0056] This method is based on the fact that when the elements of a matrix are perturbed slightly, then the pertur- 
bations in the eigenvectors and eigenvalues are also small, and are bounded if the perturbation in the matrix elements 
is bounded. See R. Everson et al., supra. Below we disclose a method for finding afamily of optimally small perturbations 
55 that decrease the dimensionality of the system, and thus, the time to diagonalize it. 

[0057] Indeed, when the normal vector to the surface p(x) at the point x is close to some reference normal q, and 
also its albedo a(x) is close to a reference albedo a q , then the substitution of a(x) and p(x) with a q and q, respectively, 
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in (eqn. 13) would lead to a small perturbation of R(x,y). Then, if many image points have albedos and normals that 
are close to the same reference albedo and normal-vector pair, substituting all of them will lead to a series of small 
perturbations in the matrix elements of R, but will also reduce the degrees of freedom in the problem. 
[0058] Thus, a solution to the large-dimensionality problem is to find a set of Q pairs of reference values 207 for the 
5 albedo and normals, {(tt q ,q)} q(E Q such that, for a given value of Q, the perturbations are smallest. This is exactly the 
setting of a variety of vector quantization algorithms - given a set of vectors, {(a(x),p(x))} x ^ v in our case 205, these 
algorithms cluster them together in Q clusters, and find their centroids 207, {(a q ,q)} q( =Q, such that the average distance 
from the vectors to the nearest respective cluster-centroid, q(x) = q x , is minimal. 

[0059] There are many algorithms for VQ, and most are suitable as a step in the solution of our problem. We highlight 
10 two of them here as preferred practice: the Linde-Buzo-Gray algorithm, which benefits from a straightforward and 
conceptually clean implementation and has been found to perform well in practice, and the Deterministic Annealing 
(DA) algorithm which has better speed, convergence, and optimality properties, albeit with a more complicated imple- 
mentation. See Y. Linde et al., "An Algorithm for Vector Quantizer Design", IEEE Transactions on Communications, 
Vol. 28, pp. 84-95 (1980); and K. Rose, "Deterministic Annealing for Clustering, Compression, Classification, Regres- 
15 sion, and Related Optimization Problems", Proceedings of IEEE, Vol. 86, No. 11 , pp. 2210-2239 (1998). 

[0060] In summary, for any object 105 in said database 101 , the set is generated of all of pairs of normals and albedo 
205. Then, for any desired complexity, Q, this set is clustered into Q clusters 207, and the original matrix elements R 
(x,y) are perturbed, according to (eqn. 13), to 

20 

R(x,y) -> Ft Q (x,y) = R Q (q x ,q' y ) - & q a q ,E(c(q x ,q' y )) (16) 

[0061] Notably, the rank of the VxV matrix R Q (q x ,q' y ) is the same as that of the QxQ matrix 208 with elements 
jN q N' q R Q (q, q), where N q is the number of image points that are clustered together in the cluster q; this rank is at 

25 most Q, which makes the diagonalization practical, for moderately large values of Q. Moreover, the eigenvalues of 
these two matrices are the same, and their eigenvectors are simply related - once the smaller set is known 209, the 
larger set 210 is obtained by expansion in the number of elements and rescaling of their values. Therefore, the diag- 
onalization of R Q (eqn. 16) requires at most 0(Q 3 ) computational power, which is much better than C^V 3 ). It is practical 
for relatively large values of Q, on the order of 10,000. Notably, since this calculation has to be done only once per 

30 object 105, as opposed to once for every query 103 in said plurality of queries 102, it can be performed off line and 
the results stored 302. Also, since typically Q« V, the storage requirements, - for 0(Q2) elements, - are reasonable 
in practice. 

Optimal Viewpoint-Independent Dimensionality Reduction 

35 

[0062] A method is disclosed to reduce the dimensionality of the spatial covariance matrix 201 (eqn. 2), and at the 
same time preserve most of its eigen-structure 21 0. Although vector quantization with Q ~ 1 0,000 and the subsequent 
diagonalization of a system of that size is feasible, it cannot be performed in real time, nor, many times for every query 
103 - once for every different target object in said database 101 . In this section we disclose a method to speed up the 

40 calculation of the viewpoint-dependent illumination subspace 107. 

[0063] The method is based on the fact that the rendering process has two steps. On one hand, features on the 3D 
surface of the object are mapped to locations on the 2D image plane of the camera. That is, any property tp(t/, v) = tp 
(u) that is associated with the surface point u = (u, v) is mapped to a property on the image plane § (x), through the 
viewpoint-dependent warp x(u). This warp 304 is entirely a geometric entity, and depends only on the viewpoint. 

45 [0064] On the other hand, the surface properties cp interact with the light, and possibly the viewpoint, to give rise to 
image features Therefore, to the extent that the surface properties do not interact with the viewpoint, all calculations 
can be made in the viewpoint-independent coordinate system (u, v), and, at a later stage, warped to the image plane 
in a viewpoint-dependent manner. Hence, any basis of the viewpoint-dependent illumination subspace is a warp of a 
viewpoint-independent basis, defined on the surface of the object. Since the warps preserve neither the lengths of the 

50 vectors in the basis, nor the angles between them, a warp of a viewpoint-independent eigenbasis hierarchy (eqn. 2) 
does not necessarily result in a viewpoint-dependent eigenbasis hierarchy. Nevertheless, for any warp, there is region 
on the surface that maps more or less linearly to most of the area on the image plane. Hence, if the viewpoint-inde- 
pendent eigenbasis describes well that region on the surface, its warp will describe well the image. This is the foundation 
for the following method. 

55 [0065] For every object in said database 101, generate 201 and diagonalize 210 its vector-quantized viewpoint- 
independent covariance matrix 208, R Q , and from it, determine the eigenbasis hierarchy defined on the surface of the 
object 
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*(">v)~;f>>)a r V r (v) 

5 

[0066] Further, choose a cutoff, N, such that the average residual power, tr(R N ) (eqn. 4), is sufficiently small. Ratios 
of residual to noise power in the [0.1-1 0] range work well in practice. Keep only the first Neigensurfaces and store them 
for subsequent use 204. Because this computation is viewpoint-independent, it can be done off line, once per object 
105 in said database 101, and the results, stored 204. Typically, even though Q ~ 10,000 is practical as an off line 
10 computation, only the first N ~ 1 00 eigen-surfaces need to be preserved. 

[0067] Finally, at the viewpoint-dependent stage - when a query 103 at a particular viewpoint needs to be matched 
to the objects in said database - warp 304 the eigen-surfaces 204 (eqn. 1 7) to a basis 301 of the viewpoint-dependent 
illumination subspace 107 

V,W=f,W«)) (18) 
[0068] These final warps can be implemented as lookups in a table, which is very fast. 

20 

Efficient Generation of Many Viewpoint-Dependent Subspaces from the Viewpoint-Independent One 

[0069] An efficient method is disclosed to generate a low-dimensional basis of the viewpoint-dependent illumination 
subspace 107 (eqn. 18) from the pre-computed viewpoint-independent hierarchy of eigen-surfaces 204 (eqn. 17). 

25 Although this can be done for a dimensionality as low as N ~ 1 00, which is the recommended practice, reduction of N 
much further is not possible, because all areas of the surface need to be represented sufficiently well. On the other 
hand, there is theoretic (see R. Basri et al., supra.) and experimental evidence that the matching can be performed in 
a space with dimension M, 4 < M< 9. See P. Hallinan, supra.; A. Georghiades et al., supra.; R. Ishiyama et al., supra.; 
and R. Basri et al., supra. M is the final dimensionality of the viewpoint-dependent subspace in which recognition is 

30 performed. The value of M can be up to 20. In the preferred embodiment, M is between 4 and 9, but values up to 11 
have been used. In addition, the value of N should be such that 2M< N<8M. We disclose an efficient method to find 
the leading M-dimensional viewpoint-dependent eigen-subspace of the A/-dimensional viewpoint-dependent non-ei- 
gen-subspace. 

35 

R(x 9 y) = f^ p (x)^ p ^ p (y) (19) 

40 where 



(17) 



45 



cr^,(x) = X£/ pr y r (*) 



where o p and U pr are determined by the eigenvalue decomposition of an NxN matrix B rs where 



[0070] Since this viewpoint-dependent step requires a diagonalization of a matrix of dimensionality N~ 1 00, it is very 
fast and can be performed on line, many times, once for each trial match between the query 103 and the objects in 
55 said database 101. 

[0071] This allows the off-line computation and storage of the results 204 of the most time-consuming steps in the 
generation of the illumination subspace 107, once for every object 105 in said database 101, and their subsequent 
use for efficient fast on-line matching 306 of said plurality of queries 102. 
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[0072] While there has been shown and described what is considered to be preferred embodiments of the invention, 
it will, of course, be understood that various modifications and changes in form or detail could readily be made without 
departing from the spirit of the invention. It is therefore intended that the invention be not limited to the exact forms 
described and illustrated, but should be constructed to cover all modifications that may fall within the scope of the 
5 appended claims. 

Claims 

10 1 . A method to calculate an element R(x,y) of a viewpoint-independent spatial covariance matrix of a given object in 
a database for infinitely dense sampling of illumination conditions, the method comprising: 

calculating E(u x ,u y ,c), an integral over all possible light directions, weighted by their respective intensities, 
according to the equation: 



15 



25 



45 



55 



E(u x ,u y ,c) = j n ^_^ L(n)D(n,p,q) 



where neS 2 is a particular direction on the unit sphere S 2 , where L is the light intensity and n is a normal 
20 vector on the unit sphere S 2 , where u x and u y are the unit vectors of the special coordinate system defined by 

p and q, where p and q are vectors in which the x-axis is the bisector of the angle 2G between them, where p 
= (c,+s,0) and q= (c,-s,0), where c= cosG, s = sinG, and (x,y) is the image point; 
solving for the viewpoint-independent spatial covariance matrix R(x,y) according to the equation: 



R{x,y)=a(x)a(y)E(u x ,u y ,c) 



where a(x) is the intrinsic reflectance at the image point x and oc(y) is the intrinsic reflectance at the image 
point y; and 

30 tabulating and storing the result of said viewpoint-independent spatial covariance matrix R(x,y). 

2. A method to calculate an element R(x,y) of a viewpoint-independent spatial covariance matrix of a given object 
where there is no preferred direction in the illumination, the method comprising: 

35 calculating E(c(p(x),q(y))), according to the equation: 

E(C) = '^s 2 ^^^)^)^^ 1 -^^ 



40 where n^S 2 is a particular direction on the unit sphere S 2 , where C(n,c) is the joint-visibility condition, where 

n is the given direction, where p and q are vectors in which the x-axis is the bisector of the angle 2G between 
them, where p = (c,+s,0) and q = (c,-s,0), where c = cosG, s = sinG, and (x,y) is the image point; 
solving for the viewpoint-independent spatial covariance matrix R(x,y) according to the equation: 



R(x,y)=a(x)a(y)E(c(p(x),q(y))) 



where oc(x) is the intrinsic reflectance at the image point x and a(y) is the intrinsic reflectance at the image 
point y; and 

so tabulating and storing the result of said viewpoint-independent spatial covariance matrix R(x,y). 

3. The method of claim 2, comprising: 

using the Monte Carlo procedure to evaluate the integral for E(c) in claim 2. 



4. A method for finding a set of Q pairs of reference values for the albedo and normals {(a q ,q)} q(E Q, such that the 
perturbations are smallest for a given value of Q, the method comprising: 
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using vector quantization algorithms to cluster a set of vectors {(cc(x),p(x))} xe ^ together in Q clusters; 
finding the centroids {(oL q ,q)} q ^Q of said clusters such that the average distance from the vectors to the nearest 
respective cluster-centroid q(x)=q x is minimal, where is a reference albedo to an albedo a(x), q is a refer- 
ence normal close to a vector normal to the surface p(x) and x is a point on the surface of an object; and 
storing the results. 

The method of claim 4, comprising: 

using a Linde-Buzo-Gray algorithm as said vector quantization algorithm. 
The method of claim 4, comprising: 

using a Deterministic Annealing algorithm as said vector quantization algorithm. 

A method to generate a low-dimensional basis of the viewpoint-dependent illumination subspace from the pre- 
computed viewpoint-independent hierarchy of eigen-surfaces, the method comprising: 

generating and diagonalizing said object's vector-quantized viewpoint-independent covariance matrix R Q ; 
determining the eigenbasis hierarchy defined on the surface of the object from the results of R Q , using the 
equation: 



where Q is the desired complexity, where a r 2 is the non-increasing eigenspectrum of the spatial and the tem- 
poral covariance matrices, where y^t/) and y/v) are the respective orthonormal eigenvectors of said spatial 
and temporal covariance matrices, and (u,v) is a point on the surface of the object; 
choosing a cutoff N, such that the average residual signal power trR N , is between the values of 0.1 - 1 0; 
keeping only the first N eigensurfaces and storing the results for subsequent use; 

warping said stored N eigensurfaces to a basis of the viewpoint-dependent illumination subspace at the view- 
point-dependent stage using the equation: 



when a query at a particular viewpoint needs to be matched to the objects in said database; and 
storing the results. 

The method of claim 7, further comprising the step of finding M, the final dimensionality of the viewpoint-dependent 
eigen-subspace, the method comprising: 

warping a viewpoint dependent, non-eigen-subspace of dimensionality N from a viewpoint independent eigen- 
subspace of dimensionality N; and 

finding the leading M-dimensional viewpoint dependent eigen-subspace of the N-dimensional viewpoint de- 
pendent non-eigen-subspace resulting from said warp, according to the equation: 




where 
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_ N 

where a p and U pr are determined by the eigenvalue decomposition of an NxN matrix B rs where 

X 

. The method of claim 8, comprising: 

the final dimensionality of the viewpoint dependent subspace M, having a value up to 20. 

0. The method of claim 9, comprising: 

the final dimensionality of the viewpoint dependent subspace M, having a value between 4 and 9, inclusive. 

1 . The method of claim 8, comprising: 

having the value of N not less than 2 times the value of M, and not more than 8 times the value of M. 
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